Final Project: Time Series Forecasting with LSTMs, Neural Networks Eng. Class
Prof. Victor H. Alves Ribeiro, PPGEPS/PUCPR
Note
The exercise prompt in English goes here.
Abstract
This project aims to predict the returns of a selected portfolio of agricultural commodities using Long Short-Term Memory (LSTM) neural networks. The study includes ablation experiments to understand the impact of different architectural choices on model performance. The results demonstrate the importance of hyperparameter tuning and regularization in time series forecasting tasks.
Introduction
The use of Artificial Neural Networks (ANNs) for time series forecasting has gained significant traction in recent years. This project focuses on utilizing LSTMs, a type of recurrent neural network capable of handling sequential data efficiently, to predict financial returns of agricultural commodities. A detailed ablation study is conducted to explore various architectural configurations and hyperparameters.
Literature Review
Time series forecasting is critical in finance and economics. Traditional models, such as ARIMA and exponential smoothing, have limitations when dealing with non-linear and complex data. Recent studies emphasize the robustness of LSTM models in capturing temporal dependencies. This project builds on existing research by applying LSTM networks to a unique dataset of agricultural commodity returns.
…see the full paper for this complete section …
Methods
- Collecting commodity price data.
- Calculating logarithmic returns.
- Normalizing the data.
- Training LSTM models with different configurations.
- Performing grid search to optimize hyperparameters.
- Conducting residual analysis to identify uncaptured patterns and issues like autocorrelation or heteroscedasticity.
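The first three steps above can be sketched as follows. The toy prices are hypothetical (not from the dataset), and the min-max step simply mirrors what `MinMaxScaler(feature_range=(0, 1))` computes:

```python
import numpy as np
import pandas as pd

# Hypothetical toy price series standing in for a downloaded commodity
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 104.0])

# Logarithmic returns: r_t = ln(P_t / P_{t-1})
log_returns = np.log(prices / prices.shift(1)).dropna()

# Min-max normalization to [0, 1] (equivalent to MinMaxScaler on one column)
r = log_returns.to_numpy()
scaled = (r - r.min()) / (r.max() - r.min())

print(log_returns.round(4).tolist())  # [0.0198, -0.0099, 0.0388, -0.0096]
print(scaled.min(), scaled.max())     # 0.0 1.0
```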
Results discussion
Data Collection and Preprocessing
Python libraries
Code
# Importing necessary libraries
import pandas as pd
import numpy as np
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import ParameterGrid
import psutil
import time
import json
import os
import warnings
from datetime import datetime  # used for timing the run
warnings.filterwarnings('ignore')

# Additional libraries for residual analysis
from statsmodels.stats.diagnostic import acorr_ljungbox, het_breuschpagan
import statsmodels.api as sm
First, we check the available hardware:
Code
# Collecting hardware information
def get_system_info():
    system_info = {
        'CPU_cores': psutil.cpu_count(logical=True),
        'CPU_freq_MHz': psutil.cpu_freq().current,
        'Total_RAM_GB': round(psutil.virtual_memory().total / (1024**3), 2),
        'Available_RAM_GB': round(psutil.virtual_memory().available / (1024**3), 2),
        'GPU_info': 'Not available'  # Placeholder, can be expanded with libraries like GPUtil
    }
    return system_info

system_info = get_system_info()
print("System Information:", system_info)
Plotting the time series of prices and returns side by side (2 per row)
Code
# Create a directory for plots if it doesn't exist
plots_dir = 'plots'
if not os.path.exists(plots_dir):
    os.makedirs(plots_dir)

# Plot prices
print("\nPlotting time series of prices...")
num_cols = 2  # Number of plots per row
num_plots = len(data.columns)
num_rows = (num_plots + num_cols - 1) // num_cols  # Ensure enough rows
fig, axs = plt.subplots(num_rows, num_cols, figsize=(15, 5 * num_rows))
axs = axs.flatten()
for i, col in enumerate(data.columns):
    axs[i].plot(data.index, data[col])
    axs[i].set_title(f'Price Series - {col}')
    axs[i].set_xlabel('Date')
    axs[i].set_ylabel('Price')

# Hide unused subplots
for j in range(i + 1, len(axs)):
    fig.delaxes(axs[j])
plt.tight_layout()
plt.savefig(os.path.join(plots_dir, 'price_series.png'))
plt.show()
plt.close()

# Plot returns
print("Plotting time series of returns...")
num_plots_ret = len(returns.columns)
num_rows_ret = (num_plots_ret + num_cols - 1) // num_cols
fig, axs = plt.subplots(num_rows_ret, num_cols, figsize=(15, 5 * num_rows_ret))
axs = axs.flatten()
for i, col in enumerate(returns.columns):
    axs[i].plot(returns.index, returns[col])
    axs[i].set_title(f'Return Series - {col}')
    axs[i].set_xlabel('Date')
    axs[i].set_ylabel('Log Return')

# Hide unused subplots
for j in range(i + 1, len(axs)):
    fig.delaxes(axs[j])
plt.tight_layout()
plt.savefig(os.path.join(plots_dir, 'return_series.png'))
plt.show()
plt.close()
Plotting time series of prices...
Plotting time series of returns...
Preprocessing data for LSTM time series modelling:
Code
# Function to prepare data for LSTM
def prepare_data(series, time_steps):
    X, y = [], []
    for i in range(len(series) - time_steps):
        X.append(series[i:(i + time_steps)])
        y.append(series[i + time_steps])
    return np.array(X), np.array(y)
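To see what `prepare_data` produces, here is the same sliding-window logic applied to a toy integer series (redefined here so the snippet is self-contained):

```python
import numpy as np

def prepare_data(series, time_steps):
    # Sliding window: each sample holds `time_steps` consecutive values;
    # the target is the value immediately after the window
    X, y = [], []
    for i in range(len(series) - time_steps):
        X.append(series[i:(i + time_steps)])
        y.append(series[i + time_steps])
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)  # toy series 0..9
X, y = prepare_data(series, time_steps=5)
print(X.shape, y.shape)  # (5, 5) (5,)
print(X[0], y[0])        # [0. 1. 2. 3. 4.] 5.0
```

With 10 points and a window of 5, only 5 supervised samples can be formed, which is why very short series are skipped in the training loop.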
Setting the parameters:
Code
# Defining parameters
time_steps = 5  # Number of time steps
epochs = 10  # Reduced epochs for faster execution during testing

# Dictionaries to store results
models = {}
histories = {}
mse_results = {}
scalers = {}
predictions = {}
best_params_dict = {}
residuals_analysis = {}

# Directory to save reports and graphs
report_dir = 'report'
if not os.path.exists(report_dir):
    os.makedirs(report_dir)
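The grid search in the next cell iterates over every combination of the hyperparameter grid; sklearn's `ParameterGrid` is essentially a Cartesian product, which can be sketched with the standard library (toy two-key grid for illustration):

```python
from itertools import product

# Hypothetical mini-grid with two of the hyperparameters searched below
param_grid = {
    'neurons': [30, 50],
    'learning_rate': [0.001, 0.01],
}

keys = list(param_grid)
combos = [dict(zip(keys, values)) for values in product(*param_grid.values())]
print(len(combos))  # 4 combinations
print(combos[0])    # {'neurons': 30, 'learning_rate': 0.001}
```

The full grid used in this project (2 neuron counts x 2 learning rates x 2 activations x 2 batch sizes) yields 16 combinations per series, so training cost grows multiplicatively with each added hyperparameter.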
LSTM time series model fitting
Code
# Loop through each time series
for col in returns.columns:
    print(f"\nProcessing column: {col}")
    series = returns[col].values.reshape(-1, 1)

    # Check if series is empty
    if len(series) == 0:
        print(f"Series {col} is empty after preprocessing. Skipping.")
        continue
    print(f"Series {col} has {len(series)} data points.")

    # Normalizing data
    scaler = MinMaxScaler(feature_range=(0, 1))
    series_scaled = scaler.fit_transform(series)
    scalers[col] = scaler  # Storing the scaler for later inversion

    # Preparing data
    X, y = prepare_data(series_scaled, time_steps)

    # Check if X and y are non-empty
    if X.shape[0] == 0:
        print(f"Not enough data points in {col} after preparation. Skipping.")
        continue

    # Splitting into training and test sets
    split_index = int(0.8 * len(X))
    X_train_full, X_test = X[:split_index], X[split_index:]
    y_train_full, y_test = y[:split_index], y[split_index:]
    X_train_full = X_train_full.reshape((X_train_full.shape[0], X_train_full.shape[1], 1))
    X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

    # Hyperparameter grid for Grid Search
    param_grid = {
        'neurons': [30, 50],
        'learning_rate': [0.001, 0.01],
        'activation': ['tanh', 'relu'],
        'batch_size': [32, 64]
    }
    grid = ParameterGrid(param_grid)

    # Initializing variables to store best results
    best_mse = float('inf')
    best_params = None
    best_model = None

    # Performing Grid Search
    print(f"Performing Grid Search for {col}...")
    for params in grid:
        model = Sequential()
        model.add(LSTM(params['neurons'], activation=params['activation'], input_shape=(time_steps, 1)))
        model.add(Dense(1))
        optimizer = Adam(learning_rate=params['learning_rate'])
        model.compile(optimizer=optimizer, loss='mean_squared_error')
        history = model.fit(
            X_train_full, y_train_full,
            validation_data=(X_test, y_test),
            epochs=epochs,
            batch_size=params['batch_size'],
            verbose=0
        )
        y_pred = model.predict(X_test)
        y_pred_inv = scaler.inverse_transform(y_pred)
        y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1))
        mse = mean_squared_error(y_test_inv, y_pred_inv)
        if mse < best_mse:
            best_mse = mse
            best_params = params
            best_model = model
            best_y_pred = y_pred

    best_params_dict[col] = best_params
    print(f"Best parameters for {col}: {best_params} with MSE: {best_mse}")
    models[col] = best_model
    predictions[col] = {'Best Model': best_y_pred}

    # Inverting the normalization
    y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1))
    y_pred_inv = scaler.inverse_transform(best_y_pred)

    # Calculating MSE
    mse_results[col] = {'Best Model': best_mse}

    # Visualization of results
    plt.figure(figsize=(10, 4))
    plt.plot(y_test_inv, label='Actual Value')
    plt.plot(y_pred_inv, label='Prediction')
    plt.title(f'Prediction vs Actual - {col} - Best Model')
    plt.legend()
    plt.savefig(os.path.join(report_dir, f'pred_vs_actual_{col}_Best_Model.png'))
    plt.close()

    # Residual Analysis
    residuals = y_test_inv - y_pred_inv

    # Plotting residuals
    plt.figure(figsize=(10, 4))
    plt.plot(residuals, label='Residuals')
    plt.title(f'Residuals - {col} - Best Model')
    plt.legend()
    plt.savefig(os.path.join(report_dir, f'residuals_{col}_Best_Model.png'))
    plt.close()

    # Ljung-Box test for autocorrelation in residuals
    lb_test = acorr_ljungbox(residuals, lags=[10], return_df=True)
    lb_pvalue = lb_test['lb_pvalue'].values[0]

    # Plotting residuals ACF
    fig, ax = plt.subplots(figsize=(10, 4))
    sm.graphics.tsa.plot_acf(residuals.squeeze(), lags=40, ax=ax)
    plt.title(f'Residuals Autocorrelation Function - {col}')
    plt.savefig(os.path.join(report_dir, f'acf_residuals_{col}_Best_Model.png'))
    plt.close()

    # Heteroscedasticity test (Breusch-Pagan Test)
    exog = sm.add_constant(best_model.predict(X_test))
    test_bp = het_breuschpagan(residuals, exog)
    bp_pvalue = test_bp[3]

    # Convert p-values to Python float
    lb_pvalue = float(lb_pvalue)
    bp_pvalue = float(bp_pvalue)

    # Saving statistical test results
    residuals_analysis[col] = {
        'residuals': residuals.flatten().tolist(),
        'ljung_box_pvalue': lb_pvalue,
        'breusch_pagan_pvalue': bp_pvalue
    }
    print(f"Residual Analysis for {col}:")
    print(f"Ljung-Box Test p-value: {lb_pvalue}")
    print(f"Breusch-Pagan Test p-value: {bp_pvalue}")

# Displaying final results in a table
print("\nFinal Results:")
results_table = pd.DataFrame(mse_results)
print(results_table)
# Saving results to a CSV file
results_table.to_csv(os.path.join(report_dir, 'mse_results_updated.csv'), index=True)

# Saving the best parameters found
with open(os.path.join(report_dir, 'best_params.json'), 'w') as f:
    json.dump(best_params_dict, f, indent=4)

# Saving the residual analysis
with open(os.path.join(report_dir, 'residuals_analysis.json'), 'w') as f:
    json.dump(residuals_analysis, f, indent=4)
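The Ljung-Box diagnostic used in the training loop can be sanity-checked on synthetic data: for white noise the Q statistic should land near its chi-square mean (the number of lags), while for a strongly autocorrelated series it explodes. Below is a minimal NumPy sketch of the statistic; statsmodels' `acorr_ljungbox` computes the same Q plus its p-value:

```python
import numpy as np

def ljung_box_q(x, lags):
    # Ljung-Box Q statistic: n(n+2) * sum_{k=1}^{lags} rho_k^2 / (n - k),
    # where rho_k is the lag-k sample autocorrelation
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(42)
white = rng.standard_normal(500)   # no autocorrelation: Q near 10 for 10 lags
trended = np.cumsum(white)         # random walk: heavily autocorrelated, huge Q
print(ljung_box_q(white, 10))
print(ljung_box_q(trended, 10))
```

Under the null of no autocorrelation, Q follows a chi-square distribution with `lags` degrees of freedom, which is where the reported p-values come from.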
Plotting the MSEs for each time series:
Code
# Report: Documenting the results
# Plotting the MSEs for each time series
for col in mse_results.keys():
    mse_series = mse_results[col]
    plt.figure(figsize=(10, 5))
    plt.bar(mse_series.keys(), mse_series.values(), color='blue')
    plt.title(f'MSE Comparison - {col}')
    plt.ylabel('MSE')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.savefig(os.path.join(report_dir, f'mse_comparison_{col}.png'))
    plt.close()
Saving system info:
Code
# End timer
end_time = datetime.now()
elapsed_time = end_time - start_time  # This is a timedelta object
print(f"Total execution time: {elapsed_time}")

# Save execution time to the report
system_info['Execution_Time_seconds'] = elapsed_time.total_seconds()  # Convert to float for JSON
with open(os.path.join(report_dir, 'system_info.json'), 'w') as f:
    json.dump(system_info, f, indent=4)
Total execution time: 0:05:04.961364
Generating an automatic final report:
Code
# Final Report: Generating a text document with the results
report_path = os.path.join(report_dir, 'final_report.txt')
with open(report_path, 'w') as report_file:
    report_file.write("Final Project Report - Forecasting Commodity Returns with LSTM\n")
    report_file.write("=" * 80 + "\n\n")
    report_file.write("1. Project Objectives:\n")
    report_file.write("Forecast future returns of a commodity portfolio using LSTM Neural Networks.\n\n")
    report_file.write("2. Methodology:\n")
    report_file.write("- Collecting commodity price data.\n")
    report_file.write("- Calculating logarithmic returns.\n")
    report_file.write("- Normalizing the data.\n")
    report_file.write("- Training LSTM models with different configurations.\n")
    report_file.write("- Performing grid search to optimize hyperparameters.\n")
    report_file.write("- Conducting residual analysis to identify uncaptured patterns and issues like autocorrelation or heteroscedasticity.\n\n")
    report_file.write("3. Results:\n")
    report_file.write(results_table.to_string())
    report_file.write("\n\n")
    report_file.write("4. Best Parameters Found (Grid Search):\n")
    report_file.write(json.dumps(best_params_dict, indent=4))
    report_file.write("\n\n")
    report_file.write("5. Residual Analysis:\n")
    for col, res in residuals_analysis.items():
        report_file.write(f"Residual Analysis for {col}:\n")
        report_file.write(f"Ljung-Box Test p-value: {res['ljung_box_pvalue']}\n")
        report_file.write(f"Breusch-Pagan Test p-value: {res['breusch_pagan_pvalue']}\n\n")
    report_file.write("\n")
    report_file.write("6. Conclusions:\n")
    report_file.write("The study demonstrated the importance of proper hyperparameter selection and model architecture for forecasting financial returns. Regularization techniques and the choice of activation function significantly influenced model performance. The residual analysis highlighted the need to consider autocorrelation and heteroscedasticity in modeling financial time series.\n\n")
    report_file.write("7. Recommendations for Future Work:\n")
    report_file.write("- Implement additional regularization techniques, such as DropConnect or Batch Normalization.\n")
    report_file.write("- Explore more advanced architectures, like GRU or bidirectional models.\n")
    report_file.write("- Increase the dataset to improve the models' generalization capacity.\n")
    report_file.write("- Use more robust cross-validation methods to assess model stability.\n")
    report_file.write("- Integrate other features, such as technical indicators or macroeconomic variables, to enrich model inputs.\n")
    report_file.write("- Consider hybrid models that combine Machine Learning techniques with traditional statistical models.\n")
    report_file.write("\nSystem Information and Execution Time:\n")
    report_file.write(json.dumps(system_info, indent=4))
    report_file.write("\n\n")
    report_file.write("End of Report.\n")
Results and Discussion
We begin by reading the stored MSE results:
Code
# Reading the stored results
# 1. Reading the MSE results file

# Define the report directory
report_dir = 'report'

# Path to the MSE results file
mse_results_path = os.path.join(report_dir, 'mse_results_updated.csv')

# Read the CSV file
mse_results = pd.read_csv(mse_results_path, index_col=0)

# Display the DataFrame
print("\nMSE Results of the models:")
print(mse_results)
MSE Results of the models:
                GF=F      KE=F      ZC=F      ZL=F      ZM=F      ZR=F      ZS=F      ZW=F
Best Model  0.000109  0.000383  0.000304  0.000369  0.000331  0.089758  0.000153  0.000401
Next, we read the file of best hyperparameters found for each univariate time series model:
Code
# 2. Reading the best parameters file
# Path to the best parameters file
best_params_path = os.path.join(report_dir, 'best_params.json')

# Read the JSON file
with open(best_params_path, 'r') as f:
    best_params = json.load(f)

# Display the best parameters
print("\nBest parameters found for each time series:")
for series, params in best_params.items():
    print(f"{series}: {params}")
We then read back the generated final report:
Code
# 3. Reading the final report
# Path to the final report
report_path = os.path.join(report_dir, 'final_report.txt')

# Read the report
with open(report_path, 'r') as report_file:
    report_content = report_file.read()

# Display the report
print("\nFinal Report Content:")
print(report_content)
Final Report Content:
Final Project Report - Forecasting Commodity Returns with LSTM
================================================================================
1. Project Objectives:
Forecast future returns of a commodity portfolio using LSTM Neural Networks.
2. Methodology:
- Collecting commodity price data.
- Calculating logarithmic returns.
- Normalizing the data.
- Training LSTM models with different configurations.
- Performing grid search to optimize hyperparameters.
- Conducting residual analysis to identify uncaptured patterns and issues like autocorrelation or heteroscedasticity.
3. Results:
GF=F KE=F ZC=F ZL=F ZM=F ZR=F ZS=F ZW=F
Best Model 0.000109 0.000383 0.000304 0.000369 0.000331 0.089758 0.000153 0.000401
4. Best Parameters Found (Grid Search):
{
"GF=F": {
"activation": "relu",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 50
},
"KE=F": {
"activation": "tanh",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 30
},
"ZC=F": {
"activation": "tanh",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 50
},
"ZL=F": {
"activation": "relu",
"batch_size": 32,
"learning_rate": 0.001,
"neurons": 50
},
"ZM=F": {
"activation": "tanh",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 50
},
"ZR=F": {
"activation": "tanh",
"batch_size": 64,
"learning_rate": 0.001,
"neurons": 30
},
"ZS=F": {
"activation": "tanh",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 50
},
"ZW=F": {
"activation": "relu",
"batch_size": 32,
"learning_rate": 0.01,
"neurons": 50
}
}
5. Residual Analysis:
Residual Analysis for GF=F:
Ljung-Box Test p-value: 0.84832154632904
Breusch-Pagan Test p-value: 0.5140222285887335
Residual Analysis for KE=F:
Ljung-Box Test p-value: 0.019409670595273834
Breusch-Pagan Test p-value: 0.9428560812228698
Residual Analysis for ZC=F:
Ljung-Box Test p-value: 2.437562706640178e-08
Breusch-Pagan Test p-value: 0.5371942621682879
Residual Analysis for ZL=F:
Ljung-Box Test p-value: 0.2123773524884235
Breusch-Pagan Test p-value: 0.005194206956572999
Residual Analysis for ZM=F:
Ljung-Box Test p-value: 0.15055923115189182
Breusch-Pagan Test p-value: 0.30865864081407496
Residual Analysis for ZR=F:
Ljung-Box Test p-value: 8.462962086132558e-20
Breusch-Pagan Test p-value: 3.5052259181695084e-05
Residual Analysis for ZS=F:
Ljung-Box Test p-value: 0.11057863621707233
Breusch-Pagan Test p-value: 0.1363534385641425
Residual Analysis for ZW=F:
Ljung-Box Test p-value: 0.027056318872086236
Breusch-Pagan Test p-value: 0.9418644044525549
6. Conclusions:
The study demonstrated the importance of proper hyperparameter selection and model architecture for forecasting financial returns. Regularization techniques and the choice of activation function significantly influenced model performance. The residual analysis highlighted the need to consider autocorrelation and heteroscedasticity in modeling financial time series.
7. Recommendations for Future Work:
- Implement additional regularization techniques, such as DropConnect or Batch Normalization.
- Explore more advanced architectures, like GRU or bidirectional models.
- Increase the dataset to improve the models' generalization capacity.
- Use more robust cross-validation methods to assess model stability.
- Integrate other features, such as technical indicators or macroeconomic variables, to enrich model inputs.
- Consider hybrid models that combine Machine Learning techniques with traditional statistical models.
System Information and Execution Time:
{
"CPU_cores": 12,
"CPU_freq_MHz": 1800.0,
"Total_RAM_GB": 31.69,
"Available_RAM_GB": 16.79,
"GPU_info": "Not available",
"Execution_Time_seconds": 304.961364
}
End of Report.
Finally, we display the stored graphs (MSE comparisons, residuals, and residual ACFs):
Code
# 4. Viewing the graphs
from IPython.display import Image, display

# List of time series
series_list = returns.columns

# Display MSE comparison graphs
for col in series_list:
    image_path = os.path.join(report_dir, f'mse_comparison_{col}.png')
    if os.path.exists(image_path):
        display(Image(filename=image_path))
    else:
        print(f"Graph {image_path} not found.")

# Display residuals graphs
for col in series_list:
    residuals_image_path = os.path.join(report_dir, f'residuals_{col}_Best_Model.png')
    acf_image_path = os.path.join(report_dir, f'acf_residuals_{col}_Best_Model.png')
    if os.path.exists(residuals_image_path):
        display(Image(filename=residuals_image_path))
    else:
        print(f"Graph {residuals_image_path} not found.")
    if os.path.exists(acf_image_path):
        display(Image(filename=acf_image_path))
    else:
        print(f"Graph {acf_image_path} not found.")
# End of code
Conclusions
References
Code
#| eval: false
# Total timing to compile this Quarto document
end_time = datetime.now()
time_diff = end_time - start_time
print(f"Total Quarto document compiling time: {time_diff}")
Total Quarto document compiling time: 0:05:05.435859